Conditional Sparse Linear Regression

نویسنده

  • Brendan Juba
چکیده

Machine learning and statistics typically focus on building models that capture the vast majority of the data, possibly ignoring a small subset of data as “noise” or “outliers.” By contrast, here we consider the problem of jointly identifying a significant (but perhaps small) segment of a population in which there is a highly sparse linear regression fit, together with the coefficients for the linear fit. We contend that such tasks are of interest both because the models themselves may be able to achieve better predictions in such special cases, but also because they may aid our understanding of the data. We give algorithms for such problems under the sup norm, when this unknown segment of the population is described by a k-DNF condition and the regression fit is s-sparse for constant k and s. For the variants of this problem when the regression fit is not so sparse or using expected error, we also give a preliminary algorithm and highlight the question as a challenge for future work.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust Estimation in Linear Regression with Molticollinearity and Sparse Models

‎One of the factors affecting the statistical analysis of the data is the presence of outliers‎. ‎The methods which are not affected by the outliers are called robust methods‎. ‎Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers‎. ‎Besides outliers‎, ‎the linear dependency of regressor variables‎, ‎which is called multicollinearity...

متن کامل

Joint estimation of sparse multivariate regression and conditional graphical models

Multivariate regression model is a natural generalization of the classical univariate regression model for fitting multiple responses. In this paper, we propose a highdimensional multivariate conditional regression model for constructing sparse estimates of the multivariate regression coefficient matrix that accounts for the dependency structure among the multiple responses. The proposed method...

متن کامل

Conditional Sparse Coding and Multiple Regression for Grouped Data

We study the problem of multivariate regression where the data are naturally grouped, and a regression matrix is to be estimated for each group. We propose an approach in which a dictionary of low rank parameter matrices is estimated across groups, and a sparse linear combination of the dictionary elements is estimated to form a model within each group. We refer to the method as conditional spa...

متن کامل

Conditional Sparse Coding and Grouped Multivariate Regression

We study the problem of multivariate regression where the data are naturally grouped, and a regression matrix is to be estimated for each group. We propose an approach in which a dictionary of low rank parameter matrices is estimated across groups, and a sparse linear combination of the dictionary elements is estimated to form a model within each group. We refer to the method as conditional spa...

متن کامل

A non-parametric conditional factor regression model for high-dimensional input and response

In this paper, we propose a non-parametric conditional factor regression (NCFR) model for domains with high-dimensional input and response. NCFR enhances linear regression in two ways: a) introducing low-dimensional latent factors leading to dimensionality reduction and b) integrating an Indian Buffet Process as a prior for the latent factors to derive unlimited sparse dimensions. Experimental ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017